Full-Text Search Engines for Databases

Authors

  • László Kovács
  • Domonkos Tikk
Abstract

Current databases can store several terabytes of free-text documents. From the user's viewpoint, the main purpose of a database is efficient information retrieval. For textual data, information retrieval mostly concerns the selection and ranking of documents, and the selection criteria can refer either to the content of a document or to the grammar of its language. In traditional database management systems (DBMS), text manipulation is restricted to the usual string-handling facilities, i.e. exact matching of substrings. Although the SQL:1999 standard allows more powerful regular expressions, this traditional approach has major drawbacks: string-level operations are very costly on large documents because they work without task-oriented index structures. The full-text management operations that are required belong to text mining, an interdisciplinary field combining natural language processing and data mining. Because the traditional DBMS engine is inefficient for these operations, database management systems are usually extended with a dedicated full-text search (FTS) engine module. We present Oracle's solution in particular: to make full-text querying more efficient, a special engine was developed that prepares full-text queries and provides a set of language-specific and semantics-specific query operators.
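The contrast the abstract draws between index-free substring matching and a task-oriented index structure can be illustrated with a minimal sketch. This is not Oracle's engine or any DBMS implementation; the function names (`tokenize`, `substring_scan`, `build_inverted_index`, `index_search`) and the sample documents are hypothetical, chosen only to show a linear scan next to a simple inverted index.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def substring_scan(docs, term):
    """Baseline: exact, case-sensitive substring matching over every document."""
    return {doc_id for doc_id, text in docs.items() if term in text}


def build_inverted_index(docs):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def index_search(index, query):
    """Return ids of documents containing every query token (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical sample data, for illustration only.
docs = {
    1: "Full-text search engines extend the database",
    2: "Exact matching of substrings is costly",
    3: "The search engine builds an index",
}
index = build_inverted_index(docs)

print(substring_scan(docs, "search"))        # scans all three documents
print(index_search(index, "Search"))         # one dictionary lookup, case-insensitive
print(index_search(index, "search index"))   # intersection of two posting sets
```

The scan reads every document for every query, whereas the index is built once and each query then touches only the posting sets of its own tokens. Ranking, stemming and the language-specific operators mentioned in the abstract are deliberately left out of this sketch.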

Similar resources

Comparison of full-text searching to metadata searching for genes in two biomedical literature cohorts

also is significantly lower than that of metadata searching. Certain features of articles correlated with higher relevance ratings. A significant feature measured was the number of matches of the search term in the full-text of the article, with a larger number of matches having a statistically significant higher usefulness (i.e., relevance) rating. By using the number of hits of the search ter...

Keywords and RDF Fragments: Integrating Metadata and Full-Text Search in Beagle++

Full-text search engines and metadata repositories have so far investigated very different approaches to search, mainly due to their separate and different storage systems for information and data. As we have argued in previous papers, though, integrating full-text and metadata search capabilities is crucial for powerful semantic desktop search systems [3]. Semantic metadata is able to represen...

Evaluating the Effectiveness of Keyword Search

The prevalence of free text search in web search engines has inspired recent interest in keyword search on relational databases. Whereas relational queries formally specify matching tuples, keyword queries are imprecise expressions of the user’s information need. The correctness of search results depends on the user’s subjective assessment. As a result, the empirical evaluation of a keyword ret...

Investigation on Full-Text Databases Cited in LIS

Background and Aim: The main objective of this research was to investigate the use of full-text databases in the LIS theses of Tehran State Universities within the years 2005 and 2009. Method: For this purpose, the total of 9952 citations related to 172 existing theses in the academic central libraries were studied. The data collected were analyzed by the bibliometrics and citation analysis met...

How to find the best evidence.

The Internet has made finding evidence for clinical practice fairly easy. Many different types of databases that can be searched for relevant key terms are available for free or for subscription. Bibliographic or library databases contain books, book chapters, reports, citations, abstracts, and either the full text of the articles indexed or links to the full text. Citation databases are specia...

Publication date: 2009